{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Training Neural Networks with MinPy Solver\n",
    "\n",
    "This notebook demonstrates how to use MinPy's solver to train a 2-layer fully connected network. MinPy supports NumPy-style syntax, which enables researchers to specify network architectures in a flexible way. MinPy's solver significantly simplifies the training procedure. Once the model is specified, developers can start training the model instantly via built-in solver.\n",
    "\n",
    "Below is the general pipeline of training neural networks through MinPy's solver architecture:\n",
    "1. Define a class to specify the neural network architecture. The class is supposed to inherit from `minpy.nn.model.ModelBase`, which provides interfaces for model's compatibility with MinPy solver.\n",
    "2. Load and convert the data to by a `minpy.nn.io.NDArrayIter` instance.\n",
    "3. Pass `NDArrayIter` and neural network objects to an instance of `minpy.nn.solver.Solver`, initialize parameters (`solver.init()`) and start training (`solver.train()`).\n",
    "\n",
    "This notebook will guide you through all the procedures."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Some Imports\n",
    "Import MinPy's core system: its NumPy interface. Also import MinPy's `nn` package for neural networks and data utility."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import minpy.numpy as np\n",
    "from minpy.nn import layers\n",
    "from minpy.nn.model import ModelBase\n",
    "from minpy.nn.solver import Solver\n",
    "from minpy.nn.io import NDArrayIter\n",
    "from examples.utils.data_utils import get_CIFAR10_data"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Define Hyperparameters.\n",
    "You may try different settings by changing hyperparameters below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Define hyperparameters regarding network architecture.\n",
    "input_size           = (3, 32, 32)\n",
    "flattened_input_size = 3 * 32 * 32\n",
    "hidden_size          = 512\n",
    "num_classes          = 10\n",
    "\n",
    "# Define hyperparameters regarding training data.\n",
    "batch_size = 128\n",
    "\n",
    "# Define hyperparameters regarding optimizer.\n",
    "learning_rate = 1e-4"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Specify Neural Network Architecture\n",
    "To specify a model structure, one should define a class that inherits from `minpy.nn.model.ModelBase` and\n",
    "1. In initializer, call the initializer of base class.\n",
    "2. In initializer, call `self.add_param` to specify the names and shapes of trainable parameters.\n",
    "3. Define `forward` function, which simply receives input data and mode and return outputs. It is recommended to use layers provided in `minpy.nn.layers` if possible.\n",
    "4. Define `loss` function, which receives network prediction and ground truth and outputs loss. It is recommended to simplify this function by using loss functions defined in `minpy.nn.layers`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "class TwoLayerNet(ModelBase):\n",
    "    def __init__(self):\n",
    "        super(TwoLayerNet, self).__init__()\n",
    "        # Define model parameters.\n",
    "        self.add_param(name='w1', shape=(flattened_input_size, hidden_size)) \\\n",
    "            .add_param(name='b1', shape=(hidden_size,)) \\\n",
    "            .add_param(name='w2', shape=(hidden_size, num_classes)) \\\n",
    "            .add_param(name='b2', shape=(num_classes,))\n",
    "\n",
    "    def forward(self, X, mode):\n",
    "        # X is the input and mode is a string, which is either 'training' or 'test'.\n",
    "        # Flatten the input data to matrix.\n",
    "        X = np.reshape(X, (batch_size, 3 * 32 * 32))\n",
    "        # First affine layer (fully-connected layer).\n",
    "        y1 = layers.affine(X, self.params['w1'], self.params['b1'])\n",
    "        # ReLU activation.\n",
    "        y2 = layers.relu(y1)\n",
    "        # Second affine layer.\n",
    "        y3 = layers.affine(y2, self.params['w2'], self.params['b2'])\n",
    "        return y3\n",
    "\n",
    "    def loss(self, predict, y):\n",
    "        # Compute softmax loss between the output and the label.\n",
    "        return layers.softmax_loss(predict, y)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Load Data\n",
    "Please follow these steps to load CIFAR-10 data set and convert data to the format supported by Minpy's solver. Please specify the location of data set in data_dir."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# Create data iterators for training and testing sets.\n",
    "data_dir = 'cifar'\n",
    "data = get_CIFAR10_data(data_dir)\n",
    "train_dataiter = NDArrayIter(\n",
    "    data       = data['X_train'],\n",
    "    label      = data['y_train'],\n",
    "    batch_size = batch_size,\n",
    "    shuffle=True\n",
    ")\n",
    "test_dataiter = NDArrayIter(\n",
    "    data       = data['X_test'],\n",
    "    label      = data['y_test'],\n",
    "    batch_size = batch_size,\n",
    "    shuffle    = False\n",
    ")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Start Training!\n",
    "Create model instance, specify the configurations of Minpy's solver, initialize model parameters, and start training!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "    # Create model.\n",
    "    model = TwoLayerNet()\n",
    "    # Create solver.\n",
    "    solver = Solver(\n",
    "        model,\n",
    "        train_dataiter,\n",
    "        test_dataiter,\n",
    "        num_epochs   = 1,\n",
    "        init_rule    = 'gaussian',\n",
    "        init_config  = {'stdvar' : 0.001},\n",
    "        update_rule  = 'sgd',\n",
    "        optim_config = {'learning_rate': learning_rate},\n",
    "        verbose      = True,\n",
    "        print_every  = 20\n",
    "    )\n",
    "    # Initialize model parameters.\n",
    "    solver.init()\n",
    "    # Train!\n",
    "    solver.train()"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python [Root]",
   "language": "python",
   "name": "Python [Root]"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
